Background: Accurate prediction of outcome for metastatic renal cell carcinoma (mRCC) patients receiving targeted therapy is 
essential. Most of the available models were developed in patients treated with cytokines, and most are fairly 
complex, including at least five factors. We developed and externally validated a simple model for overall survival (OS) in mRCC. 
We also studied the recently validated International Database Consortium (IDC) model in our data sets. 

Methods: The development cohort included 170 mRCC patients treated with sunitinib. The final prognostic model was selected 
by uni- and multivariate Cox regression analyses. Risk groups were defined by the number of risk factors and by the 25th and 75th 
percentiles of the model's prognostic index distribution. The model was validated using an independent data set of 266 mRCC 
patients (validation cohort) treated with the same agent. 

Results: Eastern Co-operative Oncology Group (ECOG) performance status (PS), time from diagnosis of RCC and 
number of metastatic sites were included in the final model. Median OS of patients with 1, 2 and 3 risk factors was 24.7, 12.8 
and 5.9 months, respectively, whereas median OS was not reached for patients with 0 risk factors. The concordance (C) index for 
internal validation was 0.712, whereas the C-index for external validation was 0.634, due to differences in survival, especially in poor-risk 
populations, between the two cohorts. Predictive performance of the model was improved after recalibration. Application 
of the mRCC International Database Consortium (IDC) model resulted in a C-index of 0.574 in the development and 0.576 in the 
validation cohorts (lower than those recently reported for this model). Predictive ability was also improved after recalibration in this 
analysis. Risk stratification according to IDC model showed more similar outcomes across the development and validation cohorts 
compared with our model. 
Conclusion: Our model provides a simple prognostic tool in mRCC patients treated with a targeted agent. It had similar 
performance to the IDC model, which, however, produced more consistent survival results across the development and 
validation cohorts. The predictive ability of both models was lower than that suggested by internal validation (our model) or recent 
published data (IDC model), due to differences between observed and predicted survival among intermediate and poor-risk 
patients. Our results highlight the importance of external validation and the need for further refinement of existing prognostic 
models. 



Renal cancer is the third most frequent malignancy of the urinary 
tract and accounts for 3% of all adult malignancies (Cohen and 
McGovern, 2005). Localised disease can be cured with surgery in 
most cases. Nevertheless, ~50% of patients with renal cell carci- 
noma will present with or develop metastatic disease (Flanigan et al, 
2004; Cohen and McGovern, 2005). In this case, prognosis remains 
poor and 5-year life expectancy is <20% (Kavolius et al, 1998; 
Flanigan et al, 2004). 

Recent advances in our understanding of the biology of RCC and 
especially the role of angiogenesis in the development and 
expansion of this tumour led to the development of novel vascular 
endothelial growth factor (VEGF) -targeting therapies (Motzer et al, 
2007; Escudier et al, 2007a, b), which proved to be superior to the 
previous standard, interferon (IFN). Sunitinib is an inhibitor 
of the split-kinase-domain family of receptor tyrosine kinases 
(including VEGF receptors) (Chow and Eckhardt, 2007). It has been 
established as first-line treatment for advanced RCC, following the 
results of a randomised phase III trial, which showed a significant 
advantage over IFNa in progression-free survival (PFS) (Motzer et al, 
2007). In spite of this undisputed benefit, the prognosis of advanced 
RCC remains poor, while the toxicity of sunitinib (as well as that of 
other novel agents) is considerable (Bhojani et al, 2008). There is, 
therefore, a need to select patients likely to benefit from these 
therapies. 

In the era of targeted therapies, specific prognostic algorithms 
are necessary for clinical trial design, patient counselling and 
treatment decisions. Until recently, the most widely used 
prognostic model was that of the MSKCC, which uses five factors: 
LDH, Karnofsky performance status (KPS), time from nephrect- 
omy, calcium levels and haemoglobin levels, which have all been 
associated with independent prognostic significance (Motzer et al, 
1999, 2001; Negrier et al, 2005; Escudier et al, 2007c). The 
combination of these factors led to the development of a prognostic 
model including three patient groups with statistically significant 
and clinically relevant differences in survival (Motzer et al, 2001). 
The MSKCC model has been used for the design of all phase III 
trials using modern therapies. Nevertheless, there may be 
limitations associated with its use in this context. It was developed 
with patients undergoing treatment with cytokines, while the 
prognosis of patients with metastatic renal cell carcinoma (mRCC) 
with targeted therapies has been considerably improved. All 
randomised studies mainly included patients of low or intermediate 
risk, that is, populations whose composition differs from 
that of the populations used to develop the MSKCC model. Analyses in 
contemporary series have not confirmed all factors in the model as 
significant, while other factors with independent prognostic 
significance have been suggested (Motzer et al, 2009; Beuselinck 
et al, 2011; Karakiewicz et al, 2011). Furthermore, certain studies 
suggested that novel models may perform better than the MSKCC 
in the targeted therapy era (Karakiewicz et al, 2011). Finally, this 
model is fairly complex, requiring two clinical and three 
biochemical factors making its application for retrospective 
analyses somewhat problematic. Information regarding prognostic 
factors in the targeted therapy era is limited and heterogeneous 
(Choueiri et al, 2007; Motzer et al, 2008; Karakiewicz et al, 2011). 
One (Motzer et al, 2008; Karakiewicz et al, 2011) or more 
(Choueiri et al, 2007) agents were used, while PFS (and not overall 
survival (OS)) was the endpoint (Choueiri et al, 2007; Motzer et al, 
2008; Karakiewicz et al, 2011). Finally, in two of these studies 
(Motzer et al, 2008; Karakiewicz et al, 2011) only patients 
participating in a clinical study were included, thus not being 
representative of the population treated in everyday practice. None 
of these models has been externally validated. For these reasons, 
their accuracy, performance characteristics and impact on clinical 
decisions remain unknown. 

The most accepted model developed in the targeted therapy era 
is the mRCC IDC model (Heng et al, 2009). The data have been 
derived by a large (645 patients) multinational database including 
patients treated with first-line anti-VEGF therapy. Six factors 
(KPS, time from nephrectomy, calcium, haemoglobin, neutrophils 
and platelets) were used to identify three prognostic groups. This 
model has been recently externally validated and compared 
favourably with four other models, which, however, had all been 
developed in the cytokine era (Heng et al, 2013). Nevertheless, the 
median follow-up was relatively short (16 months) and the 
model is more complex than that of MSKCC. In addition, 
treatment was heterogeneous, with three agents used. Importantly, 
one of them, sorafenib, can be considered suboptimal first-line 
therapy as, unlike the other two agents, sunitinib and 
bevacizumab, it has not shown superiority to IFN in this setting 
(Escudier et al, 2009). 

We have recently used the advanced RCC database of the 
Hellenic Co-operative Oncology Group (HECOG) to study 
prognostic clinicopathological factors in patients treated with 
sunitinib. In an initial analysis of 109 patients, we identified PS, 
time from diagnosis and number of metastatic sites as independent 
predictors of survival (Bamias et al, 2010). The combination of these 
factors led to the development of a prognostic model with performance 
similar to that of the more complex MSKCC model. We are 
now reporting an updated analysis of 170 patients and external 
validation of this model. We also studied the performance of the 
IDC model in our development and validation cohorts. 



MATERIALS AND METHODS 



Patient population. The development cohort included 170 
consecutive patients with mRCC from nine Greek centres treated 
between October 2005 and December 2010. The validation cohort 
included 266 consecutive patients treated at three French and one 
Belgian centre between November 2005 and January 2012 
(Table 1). The larger part of this database has been used in a 
previous analysis of prognostic factors for PFS and OS (Beuselinck 
et al, 2011). The analysis was approved by the Institutional Review 
Board of the participating institutions and informed consent for 
the use of medical data for research purposes was obtained. 

Criteria for inclusion in this analysis included diagnosis of 
mRCC and treatment with sunitinib. Previous IFNa but not anti- 
VEGF therapy was allowed. Baseline demographic, clinical and 
laboratory data with prognostic significance according to published 
reports and the authors' experience were retrospectively collected 
from medical charts using uniform database templates to ensure 
consistent data collection. Overall survival data was available for all 
patients. The databases were updated in January 2012 before the 
final analysis. 

Table 1. Participating centres (number of patients, %) in the development 
and validation cohorts 

Centre                    N (%) 

Development cohort 
  Alexandra               66 (39) 
  Papageorgiou            40 (24) 
  Other (a)               64 (37) 
  Total                   170 (100) 

Validation cohort 
  HEGP                    90 (34) 
  KUL                     96 (36) 
  IGR                     63 (24) 
  Strasburg               17 (6) 
  Total                   266 (100) 

(a) Seven centres with fewer than 15 patients each. 



Statistical analysis. All analyses were carried out in STATA/SE 
11.2. 

Description of data and model construction. Patients' characteristics 
were presented through means, medians and proportions. 
Overall survival was the primary variable and was defined as the 
time interval between the date of first cycle of sunitinib and the 
date of death from any cause; patients still alive were censored at 
the date of last contact. Survival curves were estimated using the 
Kaplan-Meier method. Factors that were considered for their 
prognostic ability included: age (≤60 vs >60 years), sex (female vs 
male), Eastern Co-operative Oncology Group (ECOG) PS (≥1 vs 
0), time from diagnosis to treatment with sunitinib in months 
(≤12 vs >12), number of metastatic sites (>2 vs 0-2), tumour 
grade (III + IV vs I + II), nephrectomy (no vs yes), previous IFNa 
(no vs yes), histology (clear cell vs other), alkaline phosphatase 
(abnormal vs normal), LDH (abnormal vs normal), calcium 
(>10 mg dl⁻¹ vs ≤10 mg dl⁻¹), platelets (>400 × 10³ per mm³ 
vs ≤400 × 10³ per mm³), haemoglobin (≤13 g dl⁻¹ for males or 
≤11.5 g dl⁻¹ for females vs >13 g dl⁻¹ for males or >11.5 g dl⁻¹ 
for females), neutrophils (>5000 per mm³ vs ≤5000 per mm³), 
WBC (>10 000 per mm³ vs ≤10 000 per mm³), liver metastases 
(yes vs no), brain metastases (yes vs no), bone metastases (yes vs 
no) and lung metastases (yes vs no). 
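
As an illustration of how a Kaplan-Meier curve and a median OS can be derived from follow-up times and death indicators, a minimal Python sketch is given here. It is not the STATA code used for the analysis; the function names and the toy follow-up values are assumptions introduced purely for illustration.

```python
from typing import List, Optional, Tuple


def kaplan_meier(times: List[float], died: List[bool]) -> List[Tuple[float, float]]:
    """Return (time, estimated survival probability) at each distinct death time."""
    # Only death times create a step; censored times just shrink the risk set.
    death_times = sorted({t for t, d in zip(times, died) if d})
    survival = 1.0
    curve = []
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)  # still under observation at t
        deaths = sum(1 for u, d in zip(times, died) if d and u == t)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which estimated survival is <= 0.5; None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy follow-up in months (hypothetical values, not study data).
toy_curve = kaplan_meier([5, 8, 12, 20], [True, False, True, False])
toy_median = median_survival(toy_curve)
```

A return value of `None` from `median_survival` corresponds to a median OS that is "not reached", as reported for some subgroups in this study.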

For some patients in the development cohort, laboratory data 
(platelets, neutrophils, WBC, haemoglobin, calcium, LDH, ALP) 
and data on other variables (tumour grade, age, histology, liver 
metastases, lung metastases, bone metastases and brain metastases) 
were missing. To account for the missing values, we employed 
multiple imputations using the Markov Chain Monte Carlo 
method for arbitrary missing data. The variables used to generate 
imputed data were number of metastatic sites, PS, time from 
tumour diagnosis, sex, previous IFNa, previous nephrectomy and 
survival status. 

The associations of each of the above indicated factors (after 
multiple imputations) with OS were assessed through hazard ratios 
estimated from univariate Cox proportional hazards models. 
Factors for which the hazard ratios were statistically significant 
at the level of significance 0.2 after multiple imputations were then 
included in a multivariate Cox proportional hazards model. The 
final predictive model included only those variables for which the 
corresponding estimated hazard ratios were statistically significant 
at the level of 5% (P<0.05). 



After the final model was defined, patients were classified into 
risk groups in two ways: four groups on the basis of the actual 
number of prognostic factors that remained in the final model; and 
three groups defined as good, intermediate and poor risk on the 
basis of the 25th and 75th percentiles of the model's prognostic 
index risk score distribution. The former classification is familiar in 
the clinical setting, whereas the latter methodology has been 
suggested in recent studies (Royston et al, 2010). 

Internal and external validation and calibration of the model. 
ROC curves and bootstrap-corrected Harrell's C-index were used 
to assess the model's discriminatory ability (Pencina and 
D'Agostino, 2004) in the development cohort (internal validation). 
The C-index was estimated by bootstrapping with 200 resamples to 
estimate an unbiased measure of the ability of our predictive model 
to discriminate among patients in the development cohort with 
respect to their death/survival. 

External validation was performed by calculating a risk score for 
each patient in the validation cohort using the prognostic factors 
and the respective Cox regression coefficients of the model as 
estimated in the development data set. Patients were stratified 
according to their risk of death in the same way as in the 
development cohort, but using the distribution of risk scores in 
the validation data set. The model's discriminatory ability in the 
validation cohort was checked with the C-index. 
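
To make the discrimination measure concrete, the following Python sketch computes Harrell's C-index for right-censored data from a risk score. It is a simplified illustration, not the bootstrap-corrected STATA procedure used here: the function name is hypothetical, pairs with tied survival times are simply skipped, and no optimism correction is applied.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float],
                    died: Sequence[bool],
                    risk_scores: Sequence[float]) -> float:
    """Proportion of comparable patient pairs ordered correctly by the risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not died[i]:
            continue  # a censored patient cannot be the earlier failure of a pair
        for j in range(n):
            if times[i] < times[j]:  # patient i died before j was last seen
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # higher risk died first: concordant
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ordering, which is the scale on which the C-indices in this paper are reported.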

The model was recalibrated using the method described by 
Miller and Hui (1991), as this applies in the Cox PH model. 
According to this method, the need to include in the model a slope 
for the prognostic index is checked and, if a slope is needed, the recalibrated 
model is used to estimate survival probabilities for subjects in the 
validation cohort. 
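
The effect of a calibration slope on predicted survival under a Cox model can be sketched as follows. This is an illustrative reading of the approach, assuming the usual Cox relation S(t | PI) = S0(t)^exp(slope × PI); the function name and the example baseline survival value are hypothetical, and the code is not the authors' implementation.

```python
import math


def recalibrated_survival(baseline_survival: float,
                          prognostic_index: float,
                          slope: float = 1.0) -> float:
    """Predicted survival probability at a fixed time point.

    baseline_survival: S0(t), survival for a patient with prognostic index 0.
    slope: calibration slope applied to the prognostic index (1.0 = no recalibration).
    """
    if not 0.0 <= baseline_survival <= 1.0:
        raise ValueError("baseline_survival must be a probability")
    return baseline_survival ** math.exp(slope * prognostic_index)


# Hypothetical example: S0(t) = 0.8 and a prognostic index of 2.0.
uncalibrated = recalibrated_survival(0.8, 2.0)
calibrated = recalibrated_survival(0.8, 2.0, slope=0.553)
```

A slope below 1 shrinks the prognostic index towards zero, so predicted survival for higher-risk patients moves upwards, towards the baseline.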

The predictive ability of the model (with or without calibration) 
was checked by plotting the observed and the predicted survival 
curves for the indicated risk groups in the development and in the 
validation data sets. 



RESULTS 



Baseline characteristics of the development cohort. The baseline 
characteristics of the patients in the development and the 
validation cohorts who were included in the analyses are detailed 
in Table 2. Median follow-up for the development cohort was 
35.51 months and for the validation cohort 37.55 months. During 
follow-up, 103 patients of the development cohort (61%) and 151 
(57%) of the validation cohort died. The median OS in the 
development and validation data sets was 19.4 months (95% 
confidence interval (CI) 15.1-24.7) and 26.1 months (95% CI 20.4- 
31.9), respectively. Significantly more patients in the validation 
cohort had undergone nephrectomy, had previously received IFNa, 
had clear cell histology, were more than 12 months from diagnosis 
of RCC, had normal LDH and had more than two metastatic sites. 
In addition, more patients of the validation cohort were categorised 
in the favourable and intermediate risk groups according to the 
MSKCC and IDC models. 

Construction of the predictive model. Model selection on the 
basis of the development cohort is shown in Table 3. Male sex, PS 
0, >12 months from diagnosis to sunitinib initiation, 0-2 
metastatic sites, previous nephrectomy, normal LDH, alkaline 
phosphatase and calcium, platelet count ≤400 000 per mm³, lack of 
anaemia and absence of bone, lung or brain metastases were 
associated with improved survival in univariate analysis. Our final 
model (shown in the last two columns of Table 3) included the 
three factors, which were found significant in multivariate analysis 
(in order of significance): number of metastatic sites, ECOG PS 
and time from diagnosis to sunitinib. Check of the proportionality 
assumption revealed no violation (P-value = 0.392). 

Table 2. Baseline characteristics of patients in the development and validation cohorts 

                                    Development data set (n = 170)  Validation data set (n = 266) 
Characteristic                      Median OS    n    (%)           Median OS    n    (%)         P-value 

Sex                                                                                               0.891 
  Female                            11.6         45   (27)          19.1         72   (27) 
  Male                              22.4         125  (73)          29.3         194  (73) 
Nephrectomy                                                                                       <0.001 
  No                                [illegible]  37   (22)          [illegible]  5    (2) 
  Yes                               22.3         133  (78)          26.5         261  (98) 
Previous IFNa                                                                                     <0.001 
  No                                19.4         154  (91)          31.7         148  (56) 
  Yes                               [illegible]  16   (9)           [illegible]  118  (44) 
Time from initial diagnosis to                                                                    <0.001 
sunitinib therapy 
  ≤12 months                        14.2         90   (53)          21           90   (34) 
  >12 months                        33.3         80   (47)          28.8         176  (66) 
Histology                                                                                         <0.001 
  Clear cell                        19.4         149  (87)          26.9         256  (96) 
  Other                             19.8         20   (12)          7.9          10   (4) 
  Missing                                        1    (1)                        0    (0) 
Tumour grade                                                                                      0.199 
  I                                 NR           3    (2)           29.3         2    (1) 
  II                                28.8         44   (26)          40.2         53   (20) 
  III                               16.4         62   (36)          29           113  (43) 
  IV                                17.2         27   (16)          17.3         55   (21) 
  Missing                                        34   (20)                       43   (16) 
Neutrophils status                                                                                0.615 
  ≤5000                             24.7         66   (39)          34.4         130  (49) 
  >5000                             15.1         64   (38)          18.8         113  (43) 
  Missing                                        40   (23)                       23   (9) 
Platelets status                                                                                  0.595 
  ≤400                              22.3         118  (69)          29           217  (82) 
  >400                              11.2         27   (16)          13.7         43   (16) 
  Missing                                        25   (15)                       6    (2) 
Karnofsky performance status                                                                      0.086 
  0                                 36.7         96   (57)          37.2         160  (60) 
  1                                 16.2         53   (31)          18.6         91   (34) 
  2                                 6.4          16   (9)           [illegible]  10   (4) 
  3                                 3.4          5    (3)           13.6         5    (2) 
Total number of metastatic sites                                                                  <0.001 
  0-2                               29           123  (72)          33           131  (49) 
  >2                                [illegible]  47   (28)          [illegible]  135  (51) 
LDH                                                                                               <0.001 
  Normal                            29.2         74   (44)          26.1         214  (81) 
  Abnormal                          13.9         50   (29)          23.6         35   (13) 
  Missing                                        46   (27)                       17   (6) 
Calcium                                                                                           0.537 
  ≤10                               20.8         102  (60)          24.9         214  (81) 
  >10                               9.8          18   (10)          26.9         31   (12) 
  Missing                                        50   (30)                       21   (8) 
Haemoglobin                                                                                       0.309 
  ≤13 for males, ≤11.5 for females  12           69   (41)          17.9         108  (41) 
  >13 for males, >11.5 for females  29           77   (45)          30           149  (56) 
  Missing                                        24   (14)                       9    (3) 
Heng's risk classification                                                                        <0.001 
  Favourable                        37.4         16   (9)           40.2         53   (20) 
  Intermediate                      28.8         57   (34)          21           126  (47) 
  Poor                              11.2         32   (19)          13.6         48   (18) 
  Missing                                        65   (38)                       39   (15) 
Motzer's risk classification                                                                      <0.001 
  Favourable                        NR           16   (9)           38.1         59   (22) 
  Intermediate                      29.2         61   (36)          21           134  (50) 
  Poor                              11.6         37   (22)          13.6         44   (17) 
  Missing                                        56   (33)                       29   (11) 

Abbreviations: IFN = interferon; LDH = lactate dehydrogenase; NR = not reached; OS = overall survival. 

Figure 1. Observed (solid lines) and predicted (dashed lines) overall 
survival for the development data set by risk classification according 
to: (A) number of risk factors and (B) percentiles of the 
prognostic index. X-axis: time since therapy initiation (years). 

Risk stratification in the development cohort. The prognostic 
index from the model was estimated for each patient as the 
sum of the variables included in the final model multiplied 
by the log of the respective HRs (Table 3). Low values of the index 
indicate lower probability of death. Patients were classified 
according to their risk of death in four groups identified 
by the number of risk factors of the final predictive model that 
were present in a patient: no factors, any one factor, any two 
factors and all three factors; and in three groups 
(good, intermediate and poor risk) by splitting the index values 
at 0 (25th percentile) and 1.544 (75th percentile). Figure 1A and B 
(solid lines) show the observed survival curves according to either 
classification scheme. 
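
The two classification schemes can be sketched in Python as follows, using the final-model hazard ratios from Table 3 (2.06 for ECOG PS ≥1, 1.71 for ≤12 months from diagnosis, 2.75 for >2 metastatic sites) and the development-cohort cut-offs of 0 and 1.544. The function and argument names are hypothetical, and the handling of values exactly at a cut-off (good if index ≤ 25th percentile, poor if index > 75th percentile) is an assumption, since it is not specified in the text. Because the published hazard ratios are rounded, an index that falls very close to a cut-off may be classified differently from the original analysis.

```python
import math

# Final-model hazard ratios as published in Table 3 (rounded values).
HAZARD_RATIOS = {
    "ecog_ps_ge_1": 2.06,
    "diagnosis_le_12_months": 1.71,
    "metastatic_sites_gt_2": 2.75,
}


def prognostic_index(ecog_ps_ge_1: bool,
                     diagnosis_le_12_months: bool,
                     metastatic_sites_gt_2: bool) -> float:
    """Sum of log(HR) over the risk factors that are present."""
    present = {
        "ecog_ps_ge_1": ecog_ps_ge_1,
        "diagnosis_le_12_months": diagnosis_le_12_months,
        "metastatic_sites_gt_2": metastatic_sites_gt_2,
    }
    return sum(math.log(HAZARD_RATIOS[name]) for name, flag in present.items() if flag)


def number_of_risk_factors(ecog_ps_ge_1: bool,
                           diagnosis_le_12_months: bool,
                           metastatic_sites_gt_2: bool) -> int:
    """Count-based scheme: 0, 1, 2 or 3 risk factors."""
    return int(ecog_ps_ge_1) + int(diagnosis_le_12_months) + int(metastatic_sites_gt_2)


def risk_group_by_index(index: float, p25: float = 0.0, p75: float = 1.544) -> str:
    """Percentile-based scheme; boundary handling is an assumption (see lead-in)."""
    if index <= p25:
        return "good"
    if index > p75:
        return "poor"
    return "intermediate"
```

For example, a patient with ECOG PS ≥1 as the only risk factor has an index of log(2.06) and falls in the intermediate group under these assumptions.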

Internal validation. Internal validation of the model regarding its 
discriminatory ability resulted in a C-index of 0.709. After 
bootstrapping with 200 resamples, the corrected C-index was 
0.712 indicating good discriminatory performance of the model, in 
that subjects with longer predicted survival times also had longer 
actual survival. The discriminatory ability of the model is also 
demonstrated in Figure 1A and B (dashed lines): predicted survival 
curves are very close to observed survival curves. 

External validation. External validation was accomplished by 
applying the log of the HRs shown in Table 3 (7th column) to each 
patient in the validation data set to calculate the prognostic 
index. Risk groups were formed as in the development data 
set but the cutoffs for the 25th (0) and 75th (1.543) percentile 
were derived from the distribution of the prognostic index in the 
validation data set. The C-index from this model in the 
validation data set was 0.634, indicating that the model did not 
have as high discriminatory ability in the validation as in the 
development cohort. 

In Figure 2A and B, the observed and predicted survival curves 
according to the two classification schemes are shown for the 
validation data set. Predicted survival was similar to the observed 
in the best prognosis groups of both classifications but deviated in 
the other risk groups, with observed survival being longer 
than predicted. This is also evident in Figure 3A and B, where 
the observed survival is plotted for the development (solid lines) 
and validation data sets (dashed lines). 

When the model was recalibrated, the inclusion of a slope 
for the prognostic index was deemed statistically significant 
(P< 0.001) and the magnitude of the slope was 0.553 
(s.e. = 0.117). Therefore, the calibrated prognostic index was 
further used to predict survival in the validation data set. The 
improvement in the prediction of the model is depicted in 
Figure 2C and D, where the deviation between observed and 
predicted survival in the validation data set has been decreased 
compared with the prior-to-calibration analysis. Table 4 provides a 
summary of survival according to risk classification in the 
development and validation cohorts (after recalibration). 

As our model showed lower discriminatory ability in the 
validation compared with the development cohort, we also 
evaluated the performance of the model proposed by IDC, in the 
development and validation cohorts. In this way, the IDC model 
was indirectly compared with our predictive model. That is, we 
estimated a prognostic index on the basis of parameters and 
respective estimates given by Heng et al (2009), and we estimated 
predicted survival on the basis of this prognostic index in both data 
sets. Eastern Co-operative Oncology Group (ECOG) PS was 
converted to KPS by considering KPS of 100 equal to ECOG PS of 
0, KPS of 80-90 equal to ECOG PS of 1, and KPS ≤70 equal to 
ECOG PS ≥2. Risk groups were formed on the basis of the 
following six factors: PS, time from nephrectomy, calcium level, 
haemoglobin level, neutrophil count and platelet count (categorised 
as shown in Table 3) according to the published model 
(Heng et al, 2009): favourable risk 0 factors, intermediate risk 1-2, 
poor risk 3-6. C-index was 0.574 in the development and 0.576 in 
the validation data sets for the IDC model. This modest 
discriminatory ability was attributed to predicted survival being worse 
than observed survival in both the development 
(Figure 4A) and validation (Figure 4B) data sets. When the IDC 
model was recalibrated, a slope of 0.555 (P-value <0.001) was estimated 
for the development cohort and a slope of 0.580 (P-value <0.001) 
for the validation cohort. Predicted survival was much closer to 
observed after recalibration of the IDC model, especially in the 
validation cohort (Figure 4C and D). An overall evaluation of the IDC 
model with respect to survival in the development and validation 
cohorts is shown in Table 4 after calibration. It should be noted 
that, in contrast to our model, survival of each risk group 
according to the IDC classification was quite similar between the 
development and the validation cohorts. We also compared the 
predictive performance, in the development and the validation data 
sets, of our model with that proposed by IDC (Heng et al, 2009), as 
well as with that proposed by the MSKCC (Motzer et al, 2001), 
using ROC curves (data not shown). No statistically significant 
differences were seen in the validation data set, whereas our model 
performed better in the development data set; this was somewhat 
expected, as our model was derived on the basis of the highest 
predictive ability with respect to survival of subjects in the 
development data set. 
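
The performance-status conversion and the IDC grouping rule described in this section can be written compactly as below. This is a sketch of the rules as stated in the text (favourable 0, intermediate 1-2, poor 3-6 risk factors); the function names are hypothetical and the definition of each individual IDC risk factor is deliberately left out.

```python
def ecog_to_kps_band(ecog_ps: int) -> str:
    """Map ECOG PS to the KPS band it was treated as equivalent to in this analysis."""
    if ecog_ps < 0:
        raise ValueError("ECOG PS cannot be negative")
    if ecog_ps == 0:
        return "100"
    if ecog_ps == 1:
        return "80-90"
    return "<=70"  # ECOG PS of 2 or more


def idc_risk_group(n_risk_factors: int) -> str:
    """IDC group from the number of adverse factors present (0 to 6)."""
    if not 0 <= n_risk_factors <= 6:
        raise ValueError("the IDC model uses between 0 and 6 risk factors")
    if n_risk_factors == 0:
        return "favourable"
    if n_risk_factors <= 2:
        return "intermediate"
    return "poor"
```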
Table 3. Model selection through univariate and multivariate Cox models in the development cohort 

                                          Univariate (a)                Multivariate (a)              Multivariate (a) 
Variables                           N     HR (95% CI)         P-value   HR (95% CI)         P-value   HR (95% CI)         P-value 

Sex                                 170 
  Male                                    1                             1 
  Female                                  1.70 (1.12-2.58)    0.013     1.20 (0.71-2.03)    0.490 
Performance status                  170 
  0                                       1                             1                             1 
  ≥1                                      2.55 (1.73-3.77)    <0.001    1.83 (1.12-3.01)    0.016     2.06 (1.38-3.08)    <0.001 
Time diagnosis to sunitinib therapy 170 
  >12 months                              1                             1                             1 
  ≤12 months                              2.19 (1.45-3.29)    <0.001    1.56 (0.97-2.51)    0.066     1.71 (1.12-2.59)    0.013 
No. of metastatic sites             170 
  0-2                                     1                             1                             1 
  >2                                      3.46 (2.32-5.16)    <0.001    2.53 (1.24-5.17)    0.011     2.75 (1.82-4.15)    <0.001 
Tumour grade                        136 
  I + II                                  1                             1 
  III + IV                                1.38 (0.87-2.18)    0.171     1.44 (0.80-2.57)    0.219 
Nephrectomy                         170 
  Yes                                     1                             1 
  No                                      2.19 (1.42-3.37)    <0.001    1.02 (0.56-1.86)    0.960 
LDH                                 124 
  Normal                                  1                             1 
  Abnormal                                1.58 (1.03-2.43)    0.036     1.01 (0.57-1.80)    0.963 
ALP                                 125 
  Normal                                  1                             1 
  Abnormal                                2.12 (1.35-3.33)    0.001     1.18 (0.64-2.20)    0.594 
Ca                                  120 
  Normal                                  1                             1 
  Abnormal                                1.60 (0.86-2.99)    0.137     0.91 (0.42-1.99)    0.815 
Neutrophils                         130 
  ≤5000                                   1 
  >5000                                   1.22 (0.79-1.88)    0.381 
WBC                                 146 
  ≤10 000                                 1 
  >10 000                                 1.22 (0.70-2.11)    0.487 
Platelets                           145 
  ≤400                                    1                             1 
  >400                                    1.81 (1.12-2.92)    0.015     1.06 (0.59-1.88)    0.851 
Histology                           169 
  Other                                   1 
  Clear cell                              1.10 (0.59-2.05)    0.772 
Age                                 147 
  >60                                     1 
  ≤60                                     1.21 (0.79-1.84)    0.386 
Previous IFNa                       170 
  Yes                                     1 
  No                                      1.10 (0.60-2.01)    0.765 
Hb                                  146 
  >13 for males, >11.5 for females        1                             1 
  ≤13 for males, ≤11.5 for females        1.92 (1.27-2.91)    0.002     1.53 (0.88-2.68)    0.134 
Brain metastasis                    167 
  No                                      1                             1 
  Yes                                     2.78 (1.41-5.50)    0.003     2.16 (0.98-4.76)    0.057 
Liver metastasis                    168 
  No                                      1                             1 
  Yes                                     1.59 (0.90-2.80)    0.111     1.20 (0.63-2.31)    0.580 
Bone metastasis                     169 
  No                                      1                             1 
  Yes                                     1.73 (1.15-2.61)    0.009     0.88 (0.50-1.57)    0.666 
Lung                                169 
  No                                      1                             1 
  Yes                                     1.67 (1.09-2.55)    0.019     1.06 (0.63-1.78)    0.831 

Abbreviations: ALP = alkaline phosphatase; CI = confidence interval; HR = hazard ratio; IFN = interferon; LDH = lactate dehydrogenase; NR = not reached; OS = overall survival; 
WBC = white blood cells. 

(a) Multiple imputation was applied for missing values in variables platelets, neutrophils, age, haemoglobin, Ca, LDH, ALP, tumour grade, WBC, histology, liver metastasis, lung metastasis, bone 
metastasis, brain metastasis, which were selected for the multivariate model using backward selection. 
Figure 2. Observed (solid lines) and predicted (dashed lines) overall survival by risk classification according to the number of risk factors (A, C) or 
the percentiles of the prognostic index (B, D) for the validation data set before (A, B) and after (C, D) recalibration. 



Table 4. Survival according to risk stratification in the development and validation cohorts 

                      Development data set                                Validation data set 
Prognostic groups     Deaths/No.    Median        Median        Hazard    Deaths/No.    Median        Median        Hazard 
                      of patients   observed OS   predicted OS  ratios    of patients   observed OS   predicted OS  ratios 

Our model (after calibration in the validation cohort) 
  0 risk factors      15/48         NR            NR            1         31/67         38.1          50.2          1 
  1 risk factors      33/55         24.7          22.3          2.27      45/92         30            29            1.26 
  2 risk factors      33/45         12.8          11.5          4.32      54/82         20.4          19.2          1.88 
  3 risk factors      22/22         5.9           6.4           10.48     21/25         10.6          13.6          4.09 
  Good risk           15/48         NR            NR            1         31/67         38.1          50.2          1 
  Intermediate risk   46/78         22.4          21.7          2.41      70/135        29            26            1.36 
  Poor risk           42/44         7.7           8.6           8.18      50/64         13.5          17.2          2.78 

Heng's model (after calibration in both cohorts) 
  Favourable          6/16          37.4          37.4          1         24/53         40.2          43.2          1 
  Intermediate        28/57         28.8          24.8          1.62      77/126        21            21.4          1.73 
  Poor                26/32         11.2          11.2          3.64      37/48         13.6          13.1          2.79 

Abbreviations: NR = not reached; OS = overall survival. 



DISCUSSION 



An ideal prognostic model should be easy to use, include only the 
most relevant patient and disease characteristics and accurately 
distinguish patient groups with different prognosis. Our model 
fully meets the first two criteria and has satisfactory discriminatory 
ability, although there is room for improvement. 

Other prognostic models for mRCC have been previously 
proposed by the MSKCC (Motzer et al, 2001), the Cleveland Clinic 
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data set 
a set 



0 1 2 3 4 5 6 

Time since therapy initiation (years) 




H 1 1 1 1 1 1- 

0 1 2 3 4 5 6 

Time since therapy initiation (years) 

Figure 3. Observed overall survival for the development (solid lines) 
and validation data sets (dashed lines) by risk classification according 
to number of risk factors (A) or percentiles of the prognostic index (B). 

Foundation (Choueiri et al, 2007), French investigators (Negrier 
et al, 2005), the International Kidney Cancer Working Group 
(IKCWG) (Manola et al, 2011) and IDC (Heng et al, 2009). All
these models, except for the IDC, are based on outcomes of
patients treated with immunotherapy or on single-institution
experiences and have not always been externally validated. Our
model is simpler, including only three clinical factors, usually 
readily available for every patient with mRCC. This is confirmed by 
the fact that this information was available for almost all patients in 
the validation cohort, although patients were not selected on the 
basis of the availability of such information. Among the three 
prognostic factors included in our model, PS and time from 
diagnosis have been consistently found significant in all relevant 
studies in both the cytokine and targeted therapy eras (Motzer et al,
1999, 2001; Negrier et al, 2005; Choueiri et al, 2007; Escudier et al, 
2007c; Motzer et al, 2008; Heng et al, 2009; Motzer et al, 2009; 
Beuselinck et al, 2011; Karakiewicz et al, 2011), while number of 
metastatic sites has been shown to be an independent prognostic 
factor in several mRCC series (Negrier et al, 2005; Escudier et al, 
2007c; Manola et al, 2011; Poprach et al, 2012). Our patients were
homogeneously treated with sunitinib, which is one of the most
active agents in mRCC (Patard et al, 2011); the follow-up is among
the longest reported in studies with targeted therapies (Heng et al,
2009; Beuselinck et al, 2011; Karakiewicz et al, 2011; Heng et al,
2013); and most patients were not included in clinical trials, which
makes the model applicable in everyday practice. Internal validation
showed good discriminatory ability with a C-index of 0.712, similar 
to that reported for the IDC model (Heng et al, 2009). 
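
As an illustration of the discrimination measure quoted here, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is not the authors' code: the function name `harrell_c_index` and its arguments are hypothetical, pairs with tied follow-up times are simply skipped (a simplification), and a higher score is assumed to mean a worse predicted prognosis.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index (sketch).

    times       -- follow-up time for each patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- predicted risk; higher is assumed to mean shorter survival
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is usable only if the shorter time is an observed death.
        # Tied times are skipped here, which is a simplifying assumption.
        if times[a] == times[b] or not events[a]:
            continue
        usable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # model ranks the earlier death as higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable patient pairs")
    return concordant / usable
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking. In this setting the score passed in could be, for example, the number of risk factors recorded for each patient.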



Two methods for risk stratification were used. No superiority of 
one over the other was found. We believe that stratification 
according to the number of risk factors maintains the simplicity of 
the model and is more easily applicable in a clinical setting. This
classification clearly identifies a group with poor prognosis (three risk
factors), which does not seem to benefit from sunitinib therapy
(median OS, 5.9 months). Such a poor-outcome group has not been
identified by previous studies, and its identification represents an
advantage of the proposed model.
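
The two stratification approaches can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, each risk factor is passed in as an already dichotomised flag (the clinical cut-offs are not restated here), the prognostic index values are taken as given, and the handling of values lying exactly on a percentile boundary is an assumption.

```python
from statistics import quantiles


def count_risk_factors(ps_factor, time_from_diagnosis_factor, metastatic_sites_factor):
    """Return the number of adverse factors present (0 to 3).

    Each argument is a hypothetical boolean flag saying whether the
    corresponding risk factor is present for the patient.
    """
    flags = (ps_factor, time_from_diagnosis_factor, metastatic_sites_factor)
    return sum(1 for flag in flags if flag)


def stratify_by_percentiles(prognostic_index):
    """Label each patient using the 25th and 75th percentiles of the
    cohort's prognostic index (higher index assumed to mean worse prognosis)."""
    q25, _median, q75 = quantiles(prognostic_index, n=4)
    labels = []
    for value in prognostic_index:
        if value <= q25:          # boundary handling is an assumption
            labels.append("good")
        elif value <= q75:
            labels.append("intermediate")
        else:
            labels.append("poor")
    return labels
```

Counting factors needs no cohort-level statistics, which is what keeps it simple at the bedside, whereas the percentile split depends on the distribution of the index in the cohort used to derive the cut points.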

External validation yielded a C-index of 0.634, which is lower 
than that yielded by internal validation. Nevertheless, it is 
comparable to that reported for other published models, when 
studied in independent data sets (Heng et al, 2013). The less
optimal performance was mainly found in the groups with the
inferior prognosis, where the observed survival was better than that
predicted by our model, although these differences were ameliorated
with calibration. The reasons for this discrepancy are
unclear. There were imbalances between the two cohorts in
certain baseline characteristics as well as in the distribution across 
MSKCC and IDC risk groups in favour of the validation cohort. 
This is not infrequent, and has also been reported in other similar 
studies (Kang et al, 2012; Poprach et al, 2012; Yi et al, 2012). 
Of the imbalanced factors, time from diagnosis and
number of metastatic sites were included in the final model
and, therefore, their imbalance has been accounted for. As most
patients in the validation cohort had undergone nephrectomy,
separate validation analyses, including only nephrectomised
patients, were performed. These did not significantly improve our
results (data not shown). The other three imbalanced factors, that
is, previous IFN, non-clear histology and LDH, were not further
investigated. Previous IFNα and histology were not found to be
significant in univariate analysis, while LDH was not available in
27% of our patients, limiting the power of further analyses. Other
factors may have also affected our results. It has recently been 
suggested that trial-eligible patients may have different
outcomes than non-eligible patients (Heng et al, 2012), while
survival in expanded access programs (EAPs) for sunitinib has
been lower than that of the randomised study (Gore et al, 2009;
Motzer et al, 2009). Most patients from the French centres had
been included in clinical trials, in contrast to Greek and Belgian
patients. Although not all our patients would be ineligible for trials,
median survival of our cohort resembled that of the EAP, while
median OS of the validation cohort approximated that of the
randomised study. Inclusion in clinical studies may affect outcome
through more thorough tumour evaluation and follow-up. This 
may be particularly true for the detection of metastatic sites. For 
these reasons, we performed additional analyses using only Belgian 
patients as the validation cohort and also using a model with only 
PS and time from diagnosis. These analyses did not result in better 
performance of our model (data not shown) but the relatively small 
numbers included in these subgroups may limit these analyses. 

Among the previously developed models, that proposed by IDC
(Heng et al, 2009) is rapidly gaining acceptance, as it is the only
one developed with patients treated with targeted therapies, has
been externally validated and seems to have higher stratification
capability than the others (Heng et al, 2013). We, therefore,
attempted to validate this model in our two independent,
homogeneously treated, non-selected populations. Median OS of
the IDC risk groups was fairly similar between development and
validation cohorts, which is an improvement over our model and
supports its applicability in mRCC patients. Nevertheless, in both
cohorts, the C-index was below 0.6, lower than both the 0.634 of our
model and the 0.664 yielded by the external validation procedure for
IDC (Heng et al, 2013). Again, the most notable deviation of the
predicted from the observed survival was found in the poor-risk
groups, where median OS was higher (13.6 and 11.2 months) than
the reported 7.8 months (Heng et al, 2013). The latter could be, at
least partially, attributed to the fact that one-third of those patients
received sorafenib as first-line treatment, which is considered
inferior to sunitinib. Nevertheless, sorafenib-treated patients had
similar OS, while there were no available data regarding the
treatment of the poor-risk group. The less satisfactory performance
of both models in poor-risk patients, a group under-represented in
clinical trials with targeted therapies, underlines the necessity for
better characterisation of this group through more focused clinical
research. In addition, the lower C-indices yielded by external
validation for both models, compared with those by internal
validation, underline the importance of external validation and the
need for confirmation in multiple data sets before the wide
acceptance of a proposed prognostic model. There may be
certain, as yet unidentified, factors that affect outcome in
mRCC patients treated with anti-VEGF therapies and
account for the limitations of the existing models. Recent data
(Pena et al, 2010; Sun et al, 2011) suggest that the introduction of
molecular factors may improve the performance of models relying
purely on clinical factors.

Figure 4. Observed (solid lines) and predicted (dashed lines) overall survival by risk classification according to the number of risk factors using
the model proposed by mRCC IDC in the development (A, C) or the validation (B, D) data sets before (A, B) and after (C, D) recalibration.
X-axis: time since therapy initiation (years).

In conclusion, we externally validated a simple model, which
could be used to stratify patients with mRCC offered sunitinib.
Although we believe that it could be used for any type of
anti-VEGF therapy, this remains to be confirmed. The predictive
accuracy of this model appears comparable to that of the more
complex IDC model, and it could, therefore, represent a valid
alternative. Neither model performed as well in poor-risk
populations, which suggests that further refinement in
additional independent data sets may be appropriate.